A Wavelet Based Recognition System for Printed Malayalam Characters
Abstract
This paper describes an OCR system for printed Malayalam characters. Malayalam is the principal language of the South Indian state of Kerala and belongs to the Dravidian language family. The input to the system is a scanned image of a page of text, and the output is a machine-editable file. Malayalam character recognition is a complex task because two scripts are in use, the old script and the new script, and because there are many combinational characters. First, the image is preprocessed to remove noise. Skew correction is then applied to the document. Lines, words and characters are segmented from the processed document image. The proposed method uses wavelet analysis to extract features from the image, and a backpropagation neural network performs the recognition.
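The segmentation and feature-extraction stages named in the abstract can be illustrated with a minimal sketch. The abstract does not say which wavelet, how many decomposition levels, or which segmentation method the authors use, so everything here is an assumption: a horizontal projection profile for line segmentation, a Haar wavelet, two decomposition levels, and a binary glyph whose sides are divisible by four. The function names `segment_lines`, `haar_level` and `wavelet_features` are hypothetical, and the neural network classifier is not shown.

```python
import numpy as np

def segment_lines(binary):
    """Return (start, end) row ranges of text lines in a binary image (1 = ink).

    Assumed method: a row belongs to a line if it contains any ink.
    """
    has_ink = binary.sum(axis=1) > 0
    lines, start = [], None
    for row, ink in enumerate(has_ink):
        if ink and start is None:
            start = row                     # a text line begins
        elif not ink and start is not None:
            lines.append((start, row))      # a text line ends (end is exclusive)
            start = None
    if start is not None:                   # line touches the bottom edge
        lines.append((start, len(has_ink)))
    return lines

def haar_level(image):
    """One level of the 2-D orthonormal Haar transform.

    Image height and width must be even. Returns the approximation band
    followed by the three detail bands.
    """
    a = image[0::2, 0::2].astype(float)     # top-left pixel of each 2x2 block
    b = image[0::2, 1::2].astype(float)     # top-right
    c = image[1::2, 0::2].astype(float)     # bottom-left
    d = image[1::2, 1::2].astype(float)     # bottom-right
    approx = (a + b + c + d) / 2.0
    row_diff = (a + b - c - d) / 2.0        # top row minus bottom row
    col_diff = (a - b + c - d) / 2.0        # left column minus right column
    diag = (a - b - c + d) / 2.0
    return approx, row_diff, col_diff, diag

def wavelet_features(glyph, levels=2):
    """Flatten the approximation band after `levels` Haar decompositions."""
    approx = glyph
    for _ in range(levels):
        approx = haar_level(approx)[0]
    return approx.ravel()

if __name__ == "__main__":
    page = np.zeros((8, 8), dtype=int)
    page[1:3, 2:6] = 1                      # first synthetic text line
    page[5:7, 1:7] = 1                      # second synthetic text line
    print(segment_lines(page))
    print(wavelet_features(page, levels=2))
```

In a full system, a feature vector of this kind would be computed for each segmented character and passed to the classifier; the same projection-profile idea can be applied column-wise to split lines into words and characters.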
Similar resources
An Online Character Recognition System to Convert Grantha Script to Malayalam
This paper presents a novel approach to recognizing Grantha, an ancient script of South India, and converting it to Malayalam, a prevalent language in South India, using an online character recognition mechanism. The motivation behind this work is (i) developing a mechanism to recognize Grantha script in the modern world and (ii) affirming the strong connection among Grantha and Malay...
An Efficient OCR for Printed Malayalam Text using Novel Segmentation Algorithm and SVM Classifiers
This paper describes an Optical Character Recognition (OCR) system for printed text documents in Malayalam, a South Indian language. Indian scripts are rich in patterns, and the combinations of such patterns make the problem even more complex; these complex patterns are exploited to arrive at the solution. The system segments the scanned document image into text lines, words and further ch...
Off-line Handwritten Malayalam Character Recognition Using Gabor Filters
Handwritten character recognition is the ability of a computer to receive and interpret handwritten input. Computers may find it difficult to decipher printed characters in different fonts and styles, as well as handwritten characters. This paper focuses on the recognition of handwritten Malayalam characters. The proposed system consists of image acquisition, preprocessing, segm...
Spectral Analysis of Projection Histogram for Enhancing Close matching character Recognition in Malayalam
The success rate of Optical Character Recognition (OCR) systems for printed Malayalam documents is quite impressive, with state-of-the-art accuracy levels in the range of 85-95% for various. However, for real applications, further enhancement of these accuracy levels is required. One of the bottlenecks in further enhancement of the accuracy is identified as close-matching characters. In thi...
Wavepackets in the Recognition of Isolated Handwritten Characters
for the recognition of isolated handwritten Malayalam (one of the South Indian languages) characters. The key idea is that the count of zero crossings of the wavelet transform coefficients of an image characterizes it. A set of 3000 images of 20 selected characters is used for classification. All images are normalized to the same height, binarized and inverted. Two-level wavelet packet transformation ...